CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models
The novel nature of SARS-CoV-2 calls for the development of efficient de novo drug design approaches. In this study, we propose an end-to-end framework, named CogMol (Controlled Generation of Molecules), for designing new drug-like small molecules targeting novel viral proteins with high affinity and off-target selectivity. CogMol combines adaptive pre-training of a molecular SMILES Variational Autoencoder (VAE) and an efficient multi-attribute controlled sampling scheme that uses guidance from attribute predictors trained on latent features. To generate novel and optimal drug-like molecules for unseen viral targets, CogMol leverages a protein-molecule binding affinity predictor that is trained using SMILES VAE embeddings and protein sequence embeddings learned unsupervised from a large corpus. We applied the CogMol framework to three SARS-CoV-2 target proteins: main protease, receptor-binding domain of the spike protein, and non-structural protein 9 replicase. The generated candidates are novel at both the molecular and chemical scaffold levels when compared to the training data. CogMol also includes in silico screening for assessing toxicity of parent molecules and their metabolites with a multi-task toxicity classifier, synthetic feasibility with a chemical retrosynthesis predictor, and target structure binding with docking simulations.
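The idea of attribute-guided sampling in a latent space can be illustrated with a minimal sketch. This is not the authors' sampling scheme, which the abstract does not specify; it uses plain rejection sampling from a standard normal prior, and the two predictors (`toy_affinity`, `toy_druglikeness`), the thresholds, and the latent dimension are hypothetical stand-ins for predictors trained on VAE latent features. Decoding accepted latent vectors back to SMILES is omitted.

```python
import random

def sample_latent(dim, rng):
    """Draw one latent vector from a standard normal prior."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def toy_affinity(z):
    """Hypothetical stand-in for a binding-affinity predictor on latent features."""
    return z[0]

def toy_druglikeness(z):
    """Hypothetical stand-in for a drug-likeness predictor on latent features."""
    return -abs(z[1])

def controlled_sample(n, dim, predictors, thresholds, seed=0, max_tries=100_000):
    """Keep latent samples for which every predictor meets its threshold."""
    rng = random.Random(seed)
    accepted = []
    tries = 0
    while len(accepted) < n and tries < max_tries:
        z = sample_latent(dim, rng)
        tries += 1
        if all(p(z) >= t for p, t in zip(predictors, thresholds)):
            accepted.append(z)
    return accepted

samples = controlled_sample(
    n=5,
    dim=4,
    predictors=[toy_affinity, toy_druglikeness],
    thresholds=[0.5, -0.5],
)
print(len(samples), "latent vectors satisfy both attribute constraints")
```

In a real pipeline each accepted vector would be passed to the VAE decoder, and the acceptance rate would depend on how restrictive the combined attribute constraints are.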
Review for NeurIPS paper: CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models
Weaknesses: The benchmarking of the molecular VAE model does not include a null model, so its performance cannot be assessed against random sampling of chemical space. The results show that generating high-affinity ligands is more challenging for NSP9, but the authors provide no reasoning or discussion as to why this may be. Could this be an artifact of the size of the available training data and the range of affinities it covers? In the section on novelty, the authors mention using Tanimoto similarity between molecular fingerprints but do not specify the algorithm and parameters used for fingerprint generation. Previous studies have demonstrated that the calculated similarities between molecules can vary significantly between fingerprinting methods.
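The point about fingerprint choice can be made concrete with a small sketch. Tanimoto similarity between two binary fingerprints is the number of shared on-bits divided by the number of on-bits set in either. The bit sets below are hypothetical, invented for illustration rather than computed from real molecules or taken from the paper; they stand for the same pair of molecules encoded under two different fingerprinting schemes.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention assumed here for two empty fingerprints
    return len(a & b) / len(union)

# Hypothetical on-bit indices for the same molecule pair under two schemes.
scheme_1 = ({1, 4, 7, 9}, {1, 4, 8, 9})
scheme_2 = ({2, 3, 5}, {2, 6, 10})

sim_1 = tanimoto(*scheme_1)  # 3 shared bits out of 5 in the union
sim_2 = tanimoto(*scheme_2)  # 1 shared bit out of 5 in the union
print(f"scheme 1: {sim_1:.2f}, scheme 2: {sim_2:.2f}")
```

Because the same pair scores 0.60 under one encoding and 0.20 under the other, a novelty threshold on Tanimoto similarity is only interpretable once the fingerprint type and its parameters are reported.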
Review for NeurIPS paper: CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models
This paper proposes a framework, called CogMol, for designing drug-like small molecules for specific targets, and applies it to the problem of designing molecules that bind to three SARS-CoV-2 proteins. Reviewers raised various concerns and questions, and the author response largely resolved the major criticisms. Overall, based on the technical novelty, experiments, and clarity of writing, this paper passes the bar for acceptance to NeurIPS as a technical paper. However, multiple reviewers expressed concern that readers may over-interpret the results in the context of the current pandemic, because wet-lab validation experiments have not been performed (such experiments would be out of scope and are not necessary for an ML conference paper). Thus, it is strongly recommended that the authors revise the manuscript to state explicitly that no experimental validation has been performed and that only in silico binding conclusions can be drawn.